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We compare results of previous simulations of a simple model of DNA denaturation to the 
predictions of the Poland-Scheraga model. Concentrating on the critical region of the latter model we 
calculate both thermodynamic quantities and the distribution functions measured in the simulations. 
We find that the Poland-Scheraga model yields an excellent fit to the data, provided (i) we include a 
(singular) factor weighting the open ends of the doubly stranded chain, and (ii) we keep the leading 
corrections to the finite size scaling limit. The exponent ci, which governs the end- weighting factor, 
is fairly well determined: 0.1 < ci < 0.15. The exponent c, which governs the length distribution of 
large loops, is determined only poorly. The data are compatible with values of c in at least the range 
1 .9 < c < 2.2. From the data it therefore cannot be decided whether the denaturation transition 
asymptotically is of first or of second order. We suggest that simulations of doubly stranded chains 
closed at both ends might allow for a more precise determination of c. 

I. INTRODUCTION AND OVERVIEW 

Recent work [1-4] reports on extensive simulations of a simple model for the DNA denaturation transition, i. c. 
the thermal unbinding of the two strands of the DNA helix. In this model the two strands are represented by self 
avoiding walks of length on a (hypercubic) lattice. The walks also are mutually avoiding, but with some essential 
qualification. They start from the same origin and they are allowed to occupy the same lattice point, provided the 
distances of this point from the origin, as measured along the two strands, coincide. Such an overlap of 'complementary 
base units' is weighted by a factor e e > 1, where —ksTe represents a binding energy which in DNA is due to hydrogen 
bridges. 

Clearly this model is only a rough caricature of a real DNA molecule. In particular it ignores the helical structure of 
the doubly-stranded parts as well as the chemical heterogeneity of DNA, which yields a base-pair dependent binding 
energy. However, the model fully includes the large scale properties of the embedding of the chain molecule into real 
space, which are governed by the excluded volume. The results of the simulations support the common view that 
there exists a sharp unbinding transition which in the limit of infinitely long strands, N — > oo, becomes a true phase 
transition. Furthermore it is suggested [1] that this transition is of first order, with a jump in the density of bound 
base pairs, which is the order parameter of the transition. For some other observables, nontrivial power law scaling 
has been found [1-4]. 

On the theoretical side the standard model for the DNA denaturation transition has been proposed by Poland 
and Scheraga long ago [5,6]. It concentrates on the internal configuration of DNA, treated as a linear sequence of 
doubly stranded parts and singly stranded loops and easily allows for the incorporation of the sequence heterogeneity 
of real DNA molecules. The Poland Scheraga (P.S.-)model has become the standard tool for analyzing denaturation 
experiments [7,8]. Such experiments give direct access to the order parameter of the transition mentioned above. 
The data typically are analyzed in terms of 'melting curves', defined as the derivative with respect to temperature of 
the average number of bound pairs. These curves may show some pronounced sequence-dependent structure, which 
can well be reproduced by the P.S. -model. However, the model is essentially one-dimensional. It involves only the 
'chemical distance' along the chain. The embedding into three-dimensional real space shows up only in some entropic 
weight assigned to the loops. A loop consisting of two strands of length £ is weighted by a factor a£~ c , where the 
'cooperativity parameter' a governs the density of loops along the DNA, and the exponent c is calculated from the 
probability that two long walks, (£ ^> 1), starting from the same point meet again after £ steps. 

The numerical value of the loop exponent c turns out to govern the asymptotic (N — > oo) nature of the denaturation 
transition. With < c < 1 we for small a in the melting curves may see some strong peak, but no proper phase 
transition exists. For 1 < c < 2 there is a second order transition, and for c > 2 the transition is of first order. 
Originally the loops were taken as closed random walks [5], which yields c = d/2, where d is the dimension of the 
embedding space. It immediately was pointed out [9] that self-avoidance within the loops changes the loop exponent 
to c = vd, where v is the correlation length exponent of the self-avoiding walk problem. It takes values v — 3/4(d = 2), 
or v k, 0.588(<i = 3), respectively. More recently Kafri et. al. [10] invoked the theory of polymer networks to argue that 
the excluded volume interaction between the self-avoiding loop and the other parts of the DNA molecule increases 
c above 2 both in two and three dimensions. For d = 3, a value c > 2.1 was suggested. Furthermore, the scenario 
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suggests that open strands of length to, 1 <C m <C N, at the ends of the denaturating DNA should be weighted by 
a factor m~ Cl , with C\ w 0.1 for d = 3. Further extensions of this approach have been presented [3], which will be 
addressed below. (See Sect. II B.) 

Analyzing experimental melting curves within the framework of the P.S. -model one always takes c\ — 0, implying 
that no nontrivial end effects exist. The loop exponent typically is choosen as c w 1.75, close to c = vd, d = 3. The 
success of the analysis, -though most remarkable, - cannot be taken as experimental support of this choice, since 
the melting curves are not sufficient to unambiguously fix the large number of parameters involved in applying the 
model to a real DNA molecule [6-8]. It recently has been shown [11] that melting curves for a given base sequence 
calculated within the P.S.-model both with c = 1.75 or c = 2.15 can be brought on top of each other by adjusting 
the cooperativity parameter a. For c = 2.15, a has to be increased by a factor of order 10 compared to its value for 
c = 1.75. Since independent information on a is missing, the experiments leave the value of c and thus the asymptotic 
character of the transition undetermined. 

This situation asks for a closer examination of the results of the simulations described above. Clearly, if the P.S.- 
model accurately captures the physics of DNA denaturation, then it has to reproduce quantitatively the results of the 
simple lattice model, where the complicated chemical microstructure associated with many fit parameters is absent. 
The present work therefore tries to answer three related questions. 

(i) Is a model of P.S. -type able to consistently fit all the relevant simulation results? 

(ii) Does the analysis fix the exponents c, Ci? 

(Hi) If the data are compatible with a range of exponents, what other observables able to fix the exponents are in 
reach of present day computer experiments? 

The first question is not trivially answered. In view of the experimental situation described above we clearly expect 
the theory to provide a good fit to the simulated melting curves, but the computer experiments provide much more 
detailed information. In particular, the distribution functions of both the number of bound pairs and the length 
of the loops have been measured. It is thus not self-evident that we find a positive answer to question (i) : if we 
include some nontrivial weighting of the open ends the thus generalized P.S.-model reproduces all the published data 
curves remarkably well. We stress that we here refer to the whole functions, not just to asymptotic power laws. We 
will find that the chain length in the simulations by far is too small to reach asymptotic power law scaling. The 
answer to question (ii) is somewhat ambiguous. The analysis clearly shows the need of an end-weighting exponent 
0.10 < ci < 0.15, but all observables related to the distributions of the number of bound pairs or the length of the 
open ends reasonably well can be fitted with loop exponents in the range 1.7 < c < 2.3. It is only the measured loop 
length distribution which in the region of smaller £ clearly favors a value c > 2. Since, however, the loop weight a£~ c 
by construction is meant to hold for I ^> 1, fixing a precise value of c is somewhat a matter of taste. Insisting on 
fitting the data down to I w 10, a value of c = 2.05 can be extracted. However, assuming that corrections to the 
asymptotic loop weight die out only for I > 100, (we will give some arguments supporting this view), we can fit the 
data with c in the range 1.9 < c < 2.2. 

In view of these results, consideration of question (Hi) is of interest. The trivial answer that longer chains must 
be simulated seems to be unrealistic. Changing the boundary conditions should be a more realistic option. In the 
available simulations the two strands are bound together at one end, whereas the other end of the 'DNA chain' may 
be, and in general will be, open. Comparing to results for chains bound together of both ends may considerably 
restrict the allowed range of values of c. 

With the restricted chain lengths of the simulations, (Rcf. [1]: 500 < N < 3000; Refs. [2-4]: 80 < N < 1280), our 
analysis necessarily concentrates on finite size effects in the P.S.-model. However, the normal finite size scaling limit 
is not sufficient. This limit consists in taking N to infinity, with other variables, properly scaled by powers of N, 
held fixed so as to magnify the critical region. In our analysis we need to keep the leading correction terms to this 
limit, which for N — > oo vanish as ' c 2 1 . Since in any case |c — 2| is small, for the chain lengths considered these 
corrections are not negligible. Indeed, they explain the deviations of the effective exponents previously extracted from 
the data [1,3,4] from the exponents c, ci introduced in the model. 

This paper is organized as follows. In Sect. II we define our version of the P.S.-model, and we derive expressions 
for the observables of interest. Sect. Ill is devoted to the finite size scaling limit. In Sect. IV we take up question (i), 
showing that with exponents c = 2.05, Ci = 0.13 and keeping the leading corrections to the finite size scaling limit 
a good fit of the simulation data is found. In Sect. V we consider the range of exponents consistent with the data, 
(question (ii)), and other observables sensitive to c, (question (in)), are discussed in Sect. VI. Sect. VII summarizes 
our conclusions. 
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II. OBSERVABLES OF THE POLAND SCHERAGA MODEL 



The P. S. -Model describes the internal configuration of DNA as an alternating sequence of doubly stranded parts, 
(lengths j^), and loops, (single strand lengths £ fl ). The weight of such a configuration is written as a product of 
weights V {in) for the doubly stranded parts and weights U (i^) for the loops. A detailed analysis of the dependence 
of the phase transition on the asymptotic behavior of V (j), U {£) recently has been presented in Ref. [12]. We here 
for V (j) take the original choice of Poland and Scheraga: 

V{j)=vP . (2.1) 

The parameter w absorbs both energetic and entropic effects. U{£) is taken to be of long range and will be specified 
below. 

Finite size effects in one dimensional systems with long range interactions are sensitive to the boundary conditions. 
For the present problem two types of boundary conditions are to be considered, which in the sequel will be distinguished 
by indices (bb) or (bu), respectively. With (66)-boundary conditions the two strands are bound together at both ends 
of the (doubly stranded) chain, whereas (fru)-boundary conditions allow for unbinding of the two strands at one end. 

In the first part of this Section we derive general expressions for the partition function and related quantities. For 
our specific choice of U {£) and of a factor p (to, N) weighting open ends of length to more explicit expressions are 
presented in the second part. The third part of this section is devoted to some distribution functions that have been 
measured. 



A. General structure of the partition function and related quantities 

We first consider (66)-boundary conditions. For strands of total length N the partition function reads 

oo k k 

z * w = e e s ( N j ° e + w y° n ■ ( 2 - 2 ) 

k=0 {Jo.3i.--.3fc>> 1 A»=l M=l 

Here 5 (n) stands for Kronecker's symbol <5 nj o. We define generating functions 

G(A)=E^, (2-3) 

oo 

U(\) =X>"^W • (2-4) 

£=1 

The explicit form (2.2) of Zm (N) results in 

(2.5) 

Analysis of G (A) is sufficient to exhibit the asymptotic behavior, (N — ► oo) , of the model. Being interested in finite 
N, we have to invert the transform (2.3). 

Z bh {N) = j>^\ N G(\) (2.6) 

The integration extends over a closed path encircling all singularities of G (A) in the complex A-plane. 
The general ansatz for U (£) reads 

U(£) = a(£- C +U s (£)) , (2.7) 

where U s {£) decreases faster than £r c and corrects the asymptotic power law for smaller £. It is easily checked that the 
leading term a£~ c in the generating function U (A) gives rise to cut extending from A = to A* = 1. (See Subsect. II B 
for the explicit analysis.) G (A) inherits the cut and may show poles in addition. Specifically, for w sufficiently large 
there exists a pole at Ai > 1. For c > 1 this pole merges with the branch point A* = 1 at a critical value w* given by 



G(A) 



w 



■U{\) 



3 



-L-1-W(l) = 0. (2.8) 

This equation locates the critical point of the phase transition. Exploiting the positivity of U(i) it is easily checked 
that all other poles X p of G(X) obey |A P | < max (l,Ai) and therefore contribute negligibly to the behavior of long 
chains. 

In our analysis we will restrict ourselves to the critical region, defined by \w — w* \ = O (N^^) , 

4> = min(l,c- 1) . (2.9) 

Up to terms of order 1/N, which we neglect, it then is found that all A-integrals are dominated by the neighborhood 
of the branchpoint A* = 1 : |A — 1| = O (1/N) . In the sequel we will use this information in simplifying the general 
expressions. The partition function immediately yields the density p^ of bound pairs: 

ft 1 

*"5S h *-5^' (2 - 10 » 

Being interested only in the critical region we in Eq. (2.11) replaced a factor X/w by 1/w*. The fluctuation in the 
number of bound pairs 

Cb6 =iH w ^) inZbb =w" 6 (2 - i2) 

for shortness will be addressed as the melting curve. In the critical region it is related to the experimental melting 
curve via a factor relating w — w* to the deviation of the temperature from its critical value. 
Another quantity of interest is the density of loops. It is defined as 

(2 - i3 » 

where 

oo k k 

z [ b L h ] (N) = Y, k E ^N-jo-Y.^+^^Ui 11 ^^) ■ ( 2 - 14 ) 

fe=l {30.31. ■•.3fc}> 1 n = i a»=i 

In terms of the generating functions Z$ takes the form 

Z ^ (iV) = / 2Sa A ^ (A)G2(A) ' (2 ' 15) 

Since in the critical region A = 1 + O (1/N) , we can replace the factor U (A) by U (1) = 1/w* — 1, (cf. Eq. (2.8)) to 
find 

Z^(N) = (l-w*)Z^(N) . (2.16) 

We thus have the relation 

P§ = (l-w*)p bb , (2.17) 

correct up to terms of order N~^. 

This simple result has some interesting consequences. It suggests a method for a precise determination of the critical 
point w* and furthermore allows us to identify the critical region as that range where the ratio Pbb/ p\ L b essentially 
stays constant. This ratio has a simple interpretation. Up to a 1/N correction it is the mean length of the doubly 
stranded parts connecting the loops. Eq. (2.17) thus implies that during the denaturation transition not the average 
length but the average number of the doubly stranded parts decreases. Essentially pairs of loops combine to form 
larger loops. The mean length of the loops takes the simple form 
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w»=W 1 ( 1 -^) = i^(^- 1 ) • ^ 



(£) bb , as given by Eq. (2.18), is the mean length averaged over all loops on the chain. A priory it differs from the 
mean length of a specific loop, e.g. the first one. The latter is defined as 

z [ti] 

(h) bb = -f- , (2.19) 

^bb 

oo k k 

4 l] w = E E S ( N Jo E V* + ^) n ( w 

fe=l {J0.31.--,3fc}>l |U=1 M = l 

= -^— t-^-\"G(X)(-\4rU(\)) . (2.20) 

In the last line we again replaced some term X/w by 1/w*. We further note that in principle the definition of (£i) bb 
involves the probability that at least one loop exists on the chain. However, this probability differs from 1 only by 
negligible terms. In contrast to (£} bb , (ti) bb is not simply related to p bb . Physically the difference arises from the fact 
that (£) gives stronger weight to short loops, which are numerous on the chain. We will find that measuring {li) 
would allow for a more precise determination of the loop exponent c. 

All the above results easily are transferred to (fru)-boundary conditions. The partition function takes the form 

N-l 

Z bu (N)=Y,P K N) Z bb (N-m) , (2.21) 

m— 1 

where the factor p (m, N) weights open ends of length m. We recall that such a weight is absent in the original 
formulation of the P.S. -model. The generalization to p(m,N) ^ 1 will turn out to be quite important. We further 

note that we omitted the closed configuration m = 0, which carries negligible weight. Quantities Z^, Z^ , Z b ^ 
are related to their (66)-counterparts in complete analogy to Eq. (2.21). In the critical region the relation among the 
density of loops and the density of bound pairs, (cf. Eq.(2.17)), is preserved: 

P l bu = 0-~W*)pbu ■ 

The only, but quite trivial, change concerns the mean length of a loop, averaged over all loops. It takes the form 

where (m/N) is the mean length of an open end: 

Q = ^ N f mp{m ,N)Z bb{ N- m) . (2.23) 



B. Explicit expressions valid in the critical region 

The form (2.7) of U (I) yields the generating function 

U{X)=a(ci c (\\ +U S {\)\ , (2.24) 



where 



l _ l 

a . W -^j*t*£- (2.25) 
o 
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is the polylogarithmic function and T (c) denotes the Gamma function. Clearly £i c (l/A) has a cut for < A < 1, 
but no other singularities. Expressions like Eq. (2.6) show that the contribution of the cut to the A-integral for long 
chains is dominated by the region 1 — A = O (-^) . In the critical region also the relevant pole Ai approaches 1. We 
thus write 



A = 1 + 



N 



(2.26) 



\c-l 



and we expand U (A) in powers of x/N. Since the exponent c is expected to be close to c — 2, we keep powers (x/N) 
and (x/N) 1 . The order of the neglected terms depends on the asymptotic behavior of U s (I) . For U s (£) <~ l~ c , c > c, 
the leading neglected term behaves as (x/7V) mm( - 2 ' c ' c ~ 1 - ) . 

In expanding U (1 + x/N) we have to distinguish among 2 < c < 3 and 1 < c < 2. (We will ignore the case c = 2, 
where logarithmic corrections show up.) 



(i) 2 < c < 3 
We find 



U(l + -) = U(l) + a -— ^ ( 

\ NJ K ' T c)sin7rc V 



d_ 

dX 



r (c) sin 7rc V N 
W a (A)-C(c-l) 



x 
N 



(2.27) 



where ( (z) denotes the Zeta function. We now introduce a scaled 'temperature like' variable r as 

w = w* I 1 + 



a T N 



(2.28) 



where the microstructure dependent amplitude a T will be given below. Substituting Eqs. (2.27), (2.28) into 
Eq. (2.5) for G (A) and exploiting the definition (2.8) of w* we find 



G- 1 (A) = - - 1 - U (A) = — - — \x-f- ^N 2 -^ - 1 } 



(2.29) 



where 



l + w*a[ C(c-l)- — 



U a {\) 



ai 



T (c) sin 7rc 



(2.30) 
(2.31) 



(a) l < c < 2 

The expansion of U (1 + x/N) yields 



U 



"JV 



T (c) (2 - c) dX 



U 8 {\) 



(2.32) 



where / stands for the integral 



/ = 



r(c) 



dy — 



c -! „.c-l 



(-ln(l-y)) c - 



This time introducing r as 
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w = w* ( 1 



a T N- 



c-l 



(2.33) 



we find for G (A) 



where 



G- 1 (A) 



c-l 
w*oT(2-c) 



ai = — a T + a T w a 



N 1 



w*a T 



[x ' 1 - f - ai N c - 2 x] 



d_ 

dX 



Us (A) + — — 

i (2 - c) T (c) 



Using the exponent </>, Eq. (2.9), we can combine the results for 2 < c < 3 and 1 < c < 2 in the form 



G (A) =w*a T N' l 'G(x) , 

G(x) = \x* - f - ai N-\ c - 2 \x c -* 



(2.34) 
(2.35) 



(2.36) 
(2.37) 
(2.38) 



Turning now to Z bb , Eq. (2.6), we note that with A = 1 + x/N we can replace the factor X N by e x . Thus, to the 
order considered here we find 



Z bb (N) =w*a T N*- 1 Z bb (T,N) 
Z bb (f,N) = l^-e x G{x) . 



The density of bound pairs takes the form 



Pbb 



= a T N*- 1 p bb (f,N) 



The melting curve is found as 



Pbb(f,N) = —lnZ bb (f,N) . 



C bb = a 2 T N 2 *- 1 C bb (f , N) 
d 

:Pbb(f,N) . 



Cbb (f,N) 

The mean length of the first loop, Eqs. (2.19), (2.20), involves a factor 

Calculating this factor for c > 2 or c < 2 from the expressions given above we find 

1 1 — a T 



df' 



(2.39) 
(2.40) 

(2.41) 
(2.42) 

(2.43) 
(2.44) 



a T 1 — w* 1 — w* a T 



c - 1 oi N 2-c Z bb 



A,b 



, c>2 , 



or 



r - 1 1 7 [tl 



1 ai + a 7 



1 — w* a T 



Z bb a T 1 — w* 



c<2 



(2.45) 



(2.46) 



7 



respectively. We introduced the function 

Z^{f-N) = j^-e* X ^G{ X ) , (2.47) 

occuring in both expressions. 

Turning now to (6u)-boundary conditions we have to specify the weight p (to, N) of the open ends. We recall 
that p (to, A) equals 1 in the original formulation of the P.S. -model, but the network scenario suggests [13] that 
open ends of length 1 <C m <C A should be weighted by a factor <~ m~ Cl . Furthermore, as has been pointed out 
in Ref. [3], also configurations 1 <C A — to -C N, where the closed part is small compared to the dangling ends, 
should be weighted by a factor (A — to) Cl . As entropic weight for the end pieces we thus choose 

p(m,N) =m- Cl (N- m y Cl . (2.48) 

Some comment on this choice seems appropriate. It is found that with (6u)-boundary conditions in the critical 
region typically a sizeable part of the chain is open, so that using an asymptotic expression for p (m, A) seems 
justified. However, the asymptotic power laws do not uniquely fix the form of p (to, N) . The form (2.48) is just 
the simplest ansatz. It has the virtue of introducing no additional parameters. We further should note that it 
has been argued [3] that the power laws in m or (A — to) might involve different exponents. This 'co-polymer 
network' scenario was motivated by an attempt to reproduce the measured effective exponents. Our analysis 
reveals that all effects discussed in this context can be traced back to the corrections to the finite size limit, 
embodied in the terms proportional to ' 2 c I , (see Sect. IV). Thus there is no need of introducing more 
exponents, and no experimental evidence for a 'co-polymer network' scenario is left [14]. 

Having specified^ (m, A) we can calculate Z bu from Eq. (2.21). We note that Z bb (A — to) is given by Eqs. (2.39), 
(2.40), with the only change that the factor e x in Eq. (2.40) is to be replaced by e x ( 1 - ,n ) ^ where 

771 

m=-. (2.49) 

Replacing the summation over to by integration over to, we immediately find 

Z bu (A) = w*a T N^ 2ci Z bu (f , A) , (2.50) 
l 

Z bu (f, A) =Jdm mT cl (1 - fh)- Cl j> ^- e x{1 -^G (x) . (2.51) 

o 

The corresponding expression holds for Z^. All the other results of the present subsection stay unchanged, 
with the index (66) replaced by (6m). 

C. Distribution functions 

The results of the previous subsection immediately yield the distribution of the length to of the open ends. 

V l bu > (m,N) =p(m,N) . (2.52) 

Z bu (iV ) 

Eqs. (2.39), (2.40), (2.49-2.51) result in 

V£ ] (™,N) =±pW(m,f,N) , (2.53) 

^^ ^2^X1 ^^- (2 - 54) 

Also the distribution of the loop length, averaged over all loops, is easily calculated. Up to normalization it for 
(66)-boundary conditions is defined as 
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X bb (£,N) = J2 E 6 ( N - 3o - E & + ^) 

/j-— i /j.— i 

The standard analysis yields in the critical region 

X bb (£, N) = (w*a T ) 2 N 2 *- X U (£) 0(1-1) X bb (I, f , N) , (2.56) 

X bb (£, f , N) = j^- e (^>G 2 (x) , (2.57) 



where 



(2.58) 



and 9 (z) is the step function. The normalization takes the form 

E Xbb (i, N) =w*(l- w*) alN^Xhh (0, f , N) . (2.59) 

e 

We thus find the loop length distribution 

For (fru)-boundary conditions X bb is to be replaced by 

X bu (I,f,N)= J dmm- Cl (l-m)- Cl £ ^cxp ((I- m-i)x)G 2 (x) . (2.61) 
o 

Analysis of the distribution of the number of bound pairs is somewhat more involved. For (66)-boundary conditions 
it is defined as 



<&bb (iV ) 



OC K 

y bb (n,N) = E E S ( N ~ 30 E & + ^) ) 



fe=0 {30.- -.Jfc}> 1 



R ft! 

^(n-E^)^° II ( W (^)^ M ) 



/j— fi— 1 

J4b again can be evaluated by introducing a generating function, resulting in 



y bb (n,N)=w r - 



dX 
2niX' 



-X 



N-r 



l+U(X) 



(2.63) 



(2.64) 



The integral encircles the cut in the A-plane. We are interested in the range n» 1, N — n 3> 1, since only there we 
can expect to find results independent of microscopic details of the model. In this region the integral is dominated 
by the branch point A* = 1. We thus again introduce variables f and x via Eqs. (2.36), (2.26), and we use Eq. (2.8) 
to write 



l+U(\) = \ 

w* 



»-(w(i)-w(i + £)) 
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We furthermore introduce the scaled variable 



Eq. (2.64) takes the form 



w 



y bb (n,N) = — [l 



11 AT 

n = —N 



a T N<t> 



dx f x \ 

i-«;*(w(i)-w(i + ^)) 



Ar(l-a T fiAf*- 1 )-l 
a T nN*-l 



Using the expansion of U (l + j^) , and invoking N 3> 1 we find 



y bb {n,N) = — e Tn y bb (n,N) , 



dx 

y bb (n, N) = (f> — exp 



-x^n + a^N-^x^ 



Taking into account the normalization Z bb (N) , Eq. (2.39), we find as final result 



£> bb \J,N) 



(2.65) 



(2.66) 

(2.67) 
(2.68) 

(2.69) 
(2.70) 



For (fru)-boundary conditions Eqs. (2.69), (2.70) hold with the index (66) replaced by (6m), where y bu (n, N) is defined 

as 



l-a T nN*~ 1 



y bu (n,N) 



dx 



J dm m Cl (1 — m) C1 J ex ( ! - in ) .r 



—nx 



+ ai nN-^x c - 



(2.71) 
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III. FINITE SIZE SCALING LIMIT 

Taking the limit N — > oo with the scaling variables f,n,l,fh held fixed just amounts to dropping the terms 
aiiV~l c ~ 2 la; c ~^ in all expressions of the previous section. We consider the cases c > 2 or c < 2 separately. 



A. c> 2 : 4> = 1 

For N — > oo the contribution of the cut vanishes in the expressions for the partition function, Eqs. (2.40), (2.51). 
For all f only a simple pole survives. We thus for (bb) boundary conditions find the simple result 

Z bb (f) = lim Z bb (f,N) = e f . (3.1) 

JV^oo 

This yields a constant density of bound pairs, 

Pbb = a T , (3.2) 

and vanishing fluctuations, 

C bb = . (3.3) 
Indeed, the distribution of the number of bound pairs degenerates to a 5-function. 

Hr ] (n,N) = ±s(a T -%) (3.4) 

To derive Eq. (3.4) some closer inspection of Eq. (2.68) for y bb (n, N) is necessary. 

/dx r 
— exp x (1 - n) + cunN-^-^x - 1 
2m L 

1 n 

n = T7 ■ 

a T N 

We first note that a T , Eq. (2.30), obeys the inequality 

< a T < 1 . (3.5) 

This is immediately obvious from Eq. (3.2) and formally derives from the properties of the generating function IA (A) . 
Thus also oi, Eq. (2.31), is positive. A priory the integral for y bb extends over the edge of the cut x < 0, and 
for n/N > a T the term proportionally to a\ is necessary to guarantee convergence. Since c > 2 we, however, can 
transform the integration contour to the imaginary axis, where the limit N — > oo can be taken. The result 

y bb (n) - lim y bb (n, JV) = 5 (1 - n) 

N^oo 

follows. 

For the distribution of loop lengths, Eq. (2.60), the finite size scaling limit is also easily evaluated. 

Ht L] (»■ N) — - U (l) 6 (1 - I) (1 - 1) e"* (3.6) 

N^oo 1 — W 

Eqs. (2.17), (3.2) yield a constant loop density, = (1 — w*) a T . Also the mean length of the first loop, Eq. (2.45), 
tends to a constant: 

, , 1 l-a T , 
<*>» = a r • (3 ' 7) 

It in fact becomes identical to the mean length averaged over all loops, Eq. (2.18). 

In summary, we have found that for c > 2 in the finite size scaling limit the internal structure of the chain bound 
together at both ends is independent of f in the whole critical region — oo < f < oo. A priory this is somewhat 
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surprising. The explanation is provided by Eq. (3.6), which for I > 0, r = const, N ^> 1 properly should be written 



as 

Ht L] ~ N- C r c {I -I) e- lf . 
The total length contained in loops I > rj for any 77 > becomes non-negligible only for 

f ~ — In N ► —00 . 

TV^oo 

Thus denaturation occurs outside the finite size scaling limit as defined above. 

If the chain is allowed to open from one end we find a completely different behavior. Eq. (2.51) yields 

1 

Z bu {f) = J dm(m(l-m))~ Cl e^- m ^ f 


= V^T (1 - Cl ) r^-V (f 12) , (3.8) 
where I v (z) is the modified Bessel function. The density of bound pairs and the associated fluctuations take the form 

Pbu = a T pbu (t) 



with scaling functions 



C bu = a z T NC bu (f ) 



1 1 Ja_ ci (t/2) 

^^ = 2 + 2/rfw ^ 

r 1 ^Wl^lV i-c! y ci (r/2) 

Cbtt(T);= i"ilw?72)J -^w?72)- (3 - 10) 



We note that Cbu (t) is symmetric in r : 



Cbu (-r ) = C* hu (t ) 



With the choice c\ = 0.11 suggested by the network scenario the scaling functions are plotted in Figs, la or 2a, 
respectively. It must be stressed that in view of the results for (66)-boundary conditions the observed f-dependence 
is due only to the progressive prolongation of the dangling ends. It is easily checked that p bu (f ) is related to the 
average length < m/N > of the open ends as 

Pbu(f) = l-Q) . (3.11) 

Also the distribution of the number of bound pairs is easily evaluated. Following the steps explained above in the 
context of the evaluation of y b b we find 

^ P] (n,f) = lim P b ^(n,f,N) 

N^oo 

fn 

^((l-n)n)- Cl 6(l-n)^— . (3.12) 

Z bu (t) 

Thus this distribution at the critical point f = 0, (Fig. 3a), just reflects our choice of p{m,N) . "P^f P ' is directly 
related to the scaling function of the length distribution of the dangling ends: 

n„(n,f)=p|f ] (l-n,f) . (3.13) 

Thus for f = the length of the open ends fluctuates strongly. For c\ = its distribution would be completely flat. 
The choice c\ > induces a weak preference of essentially open or essentially closed configurations. 

To summarize, we have found a first important result. For (6u)-boundary conditions and c > 2 denaturation in 
the finite size scaling limit is completely due to the opening of the end. The closed part of the chain is just a passive 
spectator, undergoing no relevant change in its loop structure. The numerical value of c > 2 therefore is irrelevant. 
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B. l<c<2: 4> = c — 1 



In the region of the second order transition the contribution of the cut survives the finite size scaling limit, and the 
results become less trivial. Eq. (2.40) yields 

Z *b (j) = i — : - =l 



2ni x c — t 
sin 7rc 



dx- 



J (x c_1 cos7rc + r) 2 + (x c 1 sin7rc) 2 
o 



e (f) _2-o 

H — t c— 1 exp 

c — 1 



For (6u)-boundary conditions we find from Eq. (2.51) 
Z bu (f) = 



sin7rc wi J, a; c + c i- 3 / 2 e-*/ 2 / 1/2 _ Cl (x/2) 
1 (1 — ci) / ax 



+ ^M^Fr(i- Cl )f 

c — 1 



(x c 1 cos 7TX + t) 2 + (x c 1 sin 7rc) 2 



^(3/2-c + c l)exp iri 7l 



(3.14) 



(3.15) 



Clearly the resulting scaling functions p (f ) and C (f ) have to be evaluated numerically. (As a technical remark we 
note that for f ~ it is preferable to include in the integration contour for x a small circle centered at x = 0, so as 
to avoid the numerical complications of the pole merging with the branch point.) The results for c = 1.75, c\ = are 
shown in Figs, lb or 2b, respectively. We see some quantitative effect of the boundary conditions but qualitatively 
the behavior is not changed. 

However, an analysis of the bond number distribution again shows a strong sensitivity to the boundary conditions. 
On the technical side we note that the integral (2.68) for y>bb (n, TV), c < 2, best is evaluated along a path of constant 
phase in the x— plane. In polar coordinates (r, -0) this path for N — > oo is parameterized as 



r(ip,n) 



. sin (c — 1) V> 



sin?/; 



(3.16) 



and y^b (n) takes the form 



y bb (n) 



(c-l)n [ A „ . 
(2-c)ttJ 



dtpr c 1 (tp, n) 



sin (2 — c) tp 



o 

• exp 



nr c {ip,n) 



sin?/> 

sin (2 — c) tp 



sin?/> 



(3.17) 



A similar expression is found for J4 U (n) . For r = 0, c = 1.75, c\ = the resulting bond-number distributions are 



shown in Fig. 3b. The qualitative difference among vlf P ' and reflects the fluctuations in the length of the 

dangling ends, which again are very strong. Evaluating Eq. (2.54), for c < 2, c\ — 0, f = 0, — > oo, we find the 
simple result 



[BP] 



bu 



(m,0) = (c- 1) (1-m) 



c-2 



(3.18) 



Again this distribution is quite flat, and the most probable configuration is an essentially open chain. Thus also for 
c < 2 the denaturation transition strongly is driven by the opening of the chain ends. 

We finally recall that the density of loops just follows the density of bound pairs and thus strongly decreases 
with decreasing f , whereas the mean length averaged over all loops, Eq. (2.18), increases. (£) is also sensitive to N. 
Evaluating p^b for f = 0, N — > oo, we find 



p bb (w*,N) = a T N' 



c-2 



r(c-i) 

T(2c-2) ' 



(3.19) 
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resulting in 

N 2 - c r(2c-2) , . 

This is to be compared to the mean length of the first loop, which from Eq. (2.46) is found as 

(^ b = ^^ y r(c),, = 0. (3.21) 
As expected, (£i) asymptotically exceeds (£) . Specifically for c = 1.75 we find 

Urn ^ « 1.27 . 
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IV. A FIT TO THE RESULTS OF SIMULATIONS 



The simple model introduced and simulated in Ref. [1] briefly has been described in the introduction. We recall 
that it uses (&it)-boundary conditions for the two strands of the chain and involves a single parameter e € weighting the 
overlap of complementary base pairs. Except for allowing for such overlaps the excluded volume is fully incorporated. 

In Ref. [1] data for the density of bound pairs pb u and the specific heat e 2 Cb u are shown as function of e for single 
strand lengths up to N = 3000. In addition, for three values of e close to the estimated critical value e* = 1.3413 the 
distribution functions "P^ P ' of the number of bound pairs are given. Analysis of these distributions, in particular, 
led the authors to conclude that the transition is of first order, but sizeable corrections to finite size scaling exist. 

For the same model data on the loop length distribution "P^ L ' for chains up to length N = 1280 at e = 1.3413 have 
been presented in Refs. [3,4]. The authors extract a value of c = 2.14±0.04. Ref. [3] also provides some information on 

r p] 

the distribution of the length of the dangling ends, V bu , and on the chain length dependence of the partition function 
Zbu- Other data presented in Refs. [3,1] concern spatial properties like the distance distribution within a loop, or refer 
to the phase diagram in the limit where the chain fills a nonvanishing fraction of the volume. Such aspects are outside 
the frame of the present work. 

Inspection of the simulation results immediately shows that even for the longest chains the data cannot be re- 
produced by the finite size scaling limit of the P.S. -model. In particular, the measured distribution "P^ P ' (n/N) , 
e = 1.3413, only vaguely resembles the theoretically predicted "Pj;f P ', f ~ 0, c > 2, (Fig. 3a), and differs completely 
from vl^ P \ c < 2, (Fig. 3b). Whether including the corrections ~ ai-/V~l c ~ 2 l allows for fitting the data is the topic 
considered here. For given exponents (c, ci) this introduces a\ as a second fit parameter besides a T . A third and 
quite important fit parameter is the critical value e*. Clearly e e is proportional to the parameter w introduced in the 
P.S. -model. In the critical region we write 

4=e-^l + e-e*. 
w* 

so that Eq. (2.36) for the scaling variable f yields 

f = a T (e-e*)A^ , (4.1) 

Since with the present data takes values up to N 1 = 3000, even changing e* by 0.0001 will have some effect. Note 
that in P|f P ', f occurs in the argument of an exponential function, (cf. Eq. (2.70)). 

In fitting we proceed as follows. For given (c, Ci) we determine a T ,ai,e* by fitting simultaneously for N = 500, 
3000 the results for the average bond number pb u as function of e, given in Fig. 7a of Ref. [1]. We choose pb u , since 
it shows the smallest statistical error. Analysis of the loop length distribution involves the prefactor w*cr / (1 — w*) , 
(cf. Eq.(2.60)), which is determined by fitting the results of Fig. 1, Ref. [4]. With these parameters known, the more 
directly interpretable parameters w* , a, U (1) , W (1) = d/dXU (A) |i can be calculated. Concerning the numerical 
evaluation of our expressions the essential technical points have been mentioned in the previous section. 

We now present our results for the choice (c, c\) — (2.05,0.13) . These values are deep in the range of exponents 
which allow for a reasonable fit, as discussed in the next section. The fit uses parameters e* = 1.34110, a T = 0.2775, 
a K = 0.6746, w*a/ (1 - w*) = 1.0, leading to w* w 0.87, a w 0.14, U s (1) w -0.60, U' s (1) w -0.46. We note that 
e e /w* plays the role of an effective coordination number of the lattice walk. The value e e /w* w 4.4 resulting from 
the above parameters seems quite reasonable for the cubic lattice used in the simulations. In fact it is quite close to 
the effective coordination number w 4.68 measured for a simple self-avoiding walk on that lattice. 

Fig. 4 shows the average bond density and the melting curves as function of (e — e*) N for N — 3000, 1000, 500. We 
omitted data for other chain lengths, so as not to overload the plots. For all N the fits are of the same quality as those 
shown. Note that in panel b) we have plotted Cb u /N, since this is the quantity approaching a finite size scaling limit. 
We clearly can state excellent agreement among theory and data. Only for N = 3000 we observe a small shift of the 
maximum of the calculated melting curve relative to the data. However, this deviation is within the error bars, (cf. 
Fig. 8 of Ref. [1]), which we have suppressed here. The scaling limits, N — > oo, arc included in Fig. 4 as broken lines. 
Obviously our fit implies that we are far from the scaling limit. With corrections decaying only as A~( c ~ 2 ) = Af -0 ' 05 , 
this is no surprise. 

We now turn to the distribution of the number of bound pairs, shown in Figs. 5, 6 of Ref. [1] for values e = 1.3413, 
e = 0.999 • 1.3413 = 1.33996, e = 1.001 • 1.3413 = 1.34264. We first note that this distribution by construction of the 
simulation model factorizes according to 
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a property also valid in the P.S.-model, (cf. Eq. (2.70)). Thus 

V^ P] {n,ei,N) fei _ ea)n . . 

H B u P] (n,e 2 ,N) ' (43) 

We have checked that the data fulfill this relation within about 3% deviation. Thus measurements with different e 
essentially carry the same information. Still, changing e changes the weight associated with different regions of n/N. 
We therefore in Fig. 5 show fits to data for the largest and the smallest value of e. We note that we did not show 
the results of the theory for n < 30. From the analysis of the finite size scaling limit we know that this region is 
dominated by essentially open chains with closed parts of length < 100. For such short parts we expect the subleading 
corrections neglected here to become relevant. With this qualification we can state full agreement among theory and 
data. This strongly suggests that the P.S.-model indeed captures the essential physics of the problem. 

Fig. 6 again illustrates that the present interpretation of the data implies that we are far from the finite size scaling 
limit, ft compares at the critical point f — as calculated for N = 3000, (full line), to the finite size scaling limit, 

(long dashes). The result for N = 3000 decreases monotonically, showing no precursor of the singularity developing 
for N — > oo at n/N = a T . In the figure we also included the result for e = 1.3413, TV = 3000, (f rts 0.1665) , (short 
dashes). It shows a shallow maximum near n/N = 0.4, which clearly is not related to the asymptotic singularity, but 
rather is due to the exponential prefactor exp (nf ) . This maximum has also been observed in the simulations, and 
the present interpretation contradicts that given in Ref. [1]. 

We next consider the distribution of loop lengths. Data for e = 1.3413 and different chain lengths are presented 
in Ref. [3], Fig. 7, and Ref. [4], Fig. 1, as essentially continuous curves, from which we have drawn some points. 
We note that for the larger chain lengths the data curves for loop lengths I > 100 are rather noisy. We therefore 
for N = 1280 represent the data by a set of vertical bars, which give an impression of the range in which the 
data fluctuate. Fig. 7 shows our fit for chain lengths N — 160, 320, 1280, using only the long range part <r£~ c 
for the factor U (£) in Eq. (2.60). Even taking into account that this plot is doubly logarithmic, we find the fit 
quite remarkable. Note that we have chosen the exponent c = 2.05. Still the theory in some intermediate range 
10 i 100 yields an effective exponent c ss 2.14, as quoted in Ref. [4]. Obviously the variation of the factor Xb u (J) 
in the expression for the loop length distribution is quite essential. Evaluating this factor for f = in the finite 
size scaling limit one finds that the asymptotic exponent c within an error of about .02 can be extracted only from 
a region 1 <C i < 10~ 2 iV. Clearly, even for N = 1280 such a region does not exist. We also note that the present 
fit yields values \U S (1) | s=s 0.60, \U' S (1) | w 0.46, that are fairly small compared to their long range counterparts 
Y^Li^~ c = C(2-05) w 1.60, J2T=i il ~ c = C (1-05) w 20.6. This is consistent with the observation that the fit works 
down to very small values of I. 

For the distribution V hu (m,N) of the length of the dangling ends data have been presented only for the two- 
dimensional version of the simulation model. (See Fig. 4 of Ref. [3].) Two distinct power laws have been extracted: 

[£] f m-^,c\ = 0.23 ± 0.01; m« N 

bu \ ' \(N-m)- c \c 2 = 0.35 ± 0.01; N - m < N . { ' 

Both exponents differ from c\ = 9/32 rts 0.28, predicted by the network scenario for d = 2. For d — 3 Ref. [3] quotes 
values c'i = 0.14 ± 0.01, c 2 = 0.16 ± 0.01. As mentioned in Sect. II B the occurrence of two different power laws led 
the authors to suggest a 'co-polymer network' scenario, assuming that critical exponents for the bound part of the 
chain differ from those of the dangling ends. 

Evaluating our expression for for finite N we find that effective exponents defined as in Eq. (4.4) always obey 
the relation c\ < c\ < c 2 . Indeed, using exponents c = 77/32, ci = 9/32, predicted by the simple network scenario in 
two dimensions, we easily can reproduce the measured effective exponents for the range of chain lengths considered 
in Ref. [3]. Turning to three dimensions we for the choice of parameters employed in this section in Fig. 8 show 

doubly logarithmic plots of V^} as function of m/N or 1 — m/N, respectively. Pressed to extract effective exponents 
we would quote c'i rts 0.12, c 2 rts 0.17, not far from the simulation results quoted above. In particular, c 2 definitely 
is larger than the value c\ = 0.13 used and is in the range quoted from the simulations, c'i is a little smaller than 
expected, but we doubt that such a small deviation is significant. 

To support the co-polymer scenario, in Ref. [3] also the partition function itself has been measured. At the critical 
point Z (N) is expected to behave as 

Z(N) ~ A^'-V^ , (4.5) 

where the simple network scenario predicts 7* w 2.06 in three dimensions, whereas with exponents c'i, c 2 extracted 
from V bu the co-polymer scenario leads to 7* rs 2.00. The authors argue that their data favor the latter value. (See 
Fig. 8 of Ref. [3].) Within the philosophy of the network approach we would write the partition function at f = as 
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(4.6) 



an expression which for N — > oo reduces to Eq. (4.5). Evaluating the correction factor Zf, u (0, N) with the exponents 
and parameter values used in this section we find that in the range 100 < N < 1000 it decreases the effectively 
measured exponent by about 0.06, thus bridging the gap between 7* = 2.06 and the value 7* = 2.00 advocated in 
Ref. [3]. In summary, we in this section have shown that the P.S. -model, slightly generalized by a simple end-weighting 
factor, allows for a consistent and excellent fit of the available Monte Carlo data. Keeping the leading corrections 
to the finite size scaling limit is essential. For the chain length measured these corrections explain all the observed 
effective exponents, and without these corrections a quantitative fit of the data within the framework of the P.S. -model 
is impossible. 
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V. ESTIMATE OF THE RANGE OF EXPONENTS COMPATIBLE WITH THE DATA 



In analyzing experimental melting curves usually exponents (c, Ci) = (1.75,0) are chosen. The network scenario 
relates c, Ci to exponents governing the partition function of three- armed star polymers. With the most recent 
estimates [15] for these exponents one finds (c, ci) = (2.15,0.11). We here first examine, whether these two sets of 
exponents allow for a reasonable fit. 

In Fig. 9 we show the melting curves for chains of length N = 500, 3000. Clearly, the network based exponents 
(2.15, 0.11) yield a fit of the same quality (full lines) as found in the previous section. The choice (1.75, 0), (long 
dashes), however, fails to capture the chain length dependence of the data. Indeed, with an appropriate choice of a T , 
ai, e* each individual melting curve can be reproduced reasonably well. We here have determined the parameters by 
fitting p bu (N = 3000) . But the parameters extracted definitely depend on N. This eliminates the choice c = 1.75, 
Ci = 0. Further analysis shows that this failure is not related to the value of c but to c\ — 0. The short dashed curves 
in Fig. 9 are calculated with exponents (1.75, 0.11), which again allow for a reasonable fit. 

Keeping c\ — 0.11 fixed but varying c we in the range 1.7 < c < 2.3 have found acceptable fits for all quantities 
related to the distribution of the number of bound pairs. The change in c essentially is compensated by a change of the 
cooperativity parameter a, which varies from a (c = 1.7) s=s 0.03 to a (c = 2.3) s=s 0.45. This observation is completely 
consistent with the result of Ref. [11]. The question to which extent the data for the loop length distribution restrict 
the value of c will be discussed below. 

We now first consider the range of C\. Fig. 10 shows the distribution of the number of bound pairs for e = 1.34264, 
N = 500. The curves give theoretical results for c = 2.15, 0.07 < C\ < 0.20. As discussed earlier, we do not put 
too much weight on the range of small n (n < 50, i. c. n/N < 0.1) , where we expect subleading corrections to become 
relevant. Fig. 10 suggests that c\ should be chosen in the range 0.10 < c\ < 0.15. This is consistent with all other 
data. In particular, the height of the maxima in the melting curves is quite sensitive to c\. (Recall the discussion 
in the context of Fig. 9.) It increases with increasing c\, and reasonable fits need a value somewhere between 0.07 
and 0.15. Fig. 3a suggests an explanation of this behavior: increasing C\ puts stronger weight on essentially closed or 
essentially open configurations and makes the transition sharper. We finally note that the estimate 0.10 < c\ < 0.15 
is quite independent of the value of c chosen. 

We now turn to the determination of c. The estimate 1.7 < c < 2.3 quoted above is based on fits of the melting 
curves. For smaller iV the tail to the right of the maximum with increasing c slowly changes from undershooting the 
data to overshooting. Fig. 9b shows an indication of this effect. The other quantities related to the density of bound 
pairs are fairly insensitive to c, except that increasing c suppresses the initial peak in P ' (n, N) for n < 50, i.e. in 
the region where 1/n corrections should become relevant. 



The loop length distribution might be expected to be more sensitive to c. Fig. 11 shows log 10 (^ 2 05 'Pf )tI {£, N)\ as 



function of log 10 £, where in the theoretical curves we again replaced the explicit factor U {£) by a£~ c , cf. Eq. (2.60). 
We omitted the region log 10 I > 2.5, where the theoretical curves for all c in the range 1.7 < c < 2.3 essentially 
coincide. We also omitted the data for N = 1280, since the accuracy rapidly decreases for log 10 £ > 2, where these 
data deviate from those shown. Besides the curves for (c, ci) = (2.05,0.13), (full lines, cf. Fig. 7), we included the 
results for (1.90, 0.11), (long dashes), and (2.20, 0.11), (short dashes). For given N the curves merge for t > 10 2 . For 
smaller I the theoretical curves as function of c sweep over the data. The behavior is consistent with the parameters 
U s (1) , U' s (1) extracted, which show that for c = 1.90 the here neglected part U s (£) is predominantly positive, whereas 
it is predominantly negative for c = 2.20. We furthermore recall from the general discussion of Sect. II that the first 
moment (£} bu of can be expressed in terms of p bu , (m) bu , and 1 — w*, cf. Eq. (2.22). All the parameter sets used 

here essentially yield the same results for these quantities and thus for (£) bu ■ This suggests that for each parameter 

set we could find a short-range part U s (£) of U (£) , so that the data for v)j^ are fitted consistently together with all 
other quantities. An example is shown in the insert of Fig. 11 and will be discussed below. In view of the above, the 
problem of fixing a range of c from data for amounts to identifying a value £q such that U s {£) <C £~ c for £ > £ - 

For estimating £ the literature offers two somewhat contradictory pieces of evidence. Ref. [3] presents results of exact 
enumeration on the square lattice, for chain lengths N < 15. The data seem to obey the scaling law vf^ ] (£, N) = £r c 
V (t/N) with c = 2.44 ± .06. The network scenario predicts c w 2.41 in two dimensions. These results suggest £ n w 10, 
and accepting the same value in three dimensions we would conclude that c is very close to c = 2.05. The network 
scenario predicts c = 2.15, an estimate based on simulations of star polymers [15] with arm lengths up to n — 4000. 
However, previous work [16] covering only arm lengths n < 130 resulted in somewhat different effective exponents, 
which yield a value c 2.10. Thus, in an optimistic view we might take the good fit with c = 2.05 as supporting the 
network scenario. This value could be interpreted as an effective exponent describing loop lengths of order £ w 10 2 . 
Consistent with the above mentioned results the effective exponent is smaller than the asymptotic value c s=s 2.15. 
Clearly this interpretation implies that U s {£) is a sizeable correction up to at least £ ~ 10 2 . This is quite plausible, 
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since the corrections to asymptotic scaling in the excluded volume problem in three dimensions are known to decrease 
roughly like £ -0 ' 5 . The insert in Fig. 11 demonstrates that such corrections easily can bring the theoretical curves to 
match the data also for c ^ 2.05. (In two dimensions the corrections are expected to decrease like 1/N, which might 
explain the observation of Ref. [3] quoted above.) 

However, irrespective of the validity of any specific choice of c these considerations imply that U s (£) cannot be 
expected to be negligible for £ < 10 2 . Whatever the true value of the exponent c might be, we must expect that 
embedding of the chain into three-dimensional space gives rise to corrections typical of the excluded volume problem. 
In a less optimistic view we thus would conclude that c might take any value in about the range 1.9 < c < 2.2, these 
values resulting from the assumption that £q is of the order of 100. Even though the choice c = 2.15 based on the 
network scenario allows for a consistent interpretation of the data, we from the analysis of the data have no reliable 
arguments to exclude other values of c. 
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VI. ANALYSIS OF OTHER OBSERVABLES 



According to our analysis the present data are compatible with exponents c in at least the fairly large range 
1.9 < c < 2.2. For a more precise determination the obvious approach would be to simulate longer chains. For 
instance, with chains of length N = 30000 the predicted height of the maximum of the melting curve increases by 
about 25% in going from c = 1.9 to c = 2.2. However, it seems unlikely that for such long chains sufficiently precise 
simulations can be carried through in the near future. We thus should look for other observables sensitive to c. 

Two features of the transition prevent a precise determination of c. Firstly, the smallncss of |c — 2| forces us to 
include the corrections to finite size scaling with the associated fit parameter a\. Only the simulation of much longer 
chains could eliminate these terms, which blur the qualitative difference among c < 2 or c > 2. Secondly, with the 
(few)-boundary conditions used the chain essentially opens from the end, irrespective of c. For instance, for f = on 
average about half of the chain is open for all c in the above range, irrespective of N. Clearly, this feature suppresses 
the sensitivity to c of the transition. To eliminate this effect we should switch to (66)-boundary conditions, where 
both chain ends are closed. 

Fig. 12 shows the density of bound pairs and the melting curves predicted with (66)-boundary conditions for 
N = 500, (c, ci) = (2.20, 0.11) or (1.90, 0.11) , respectively. The parameters e*,a T ,ai are taken from the fit to the 
(fat)-data. We note that the transition is shifted towards negative values of (e — e*) N, as expected. What is more 
important, even for such short chains we sec a clear effect of changing c. For pbb the effect just increases with N. For 
Cbb/N the peak height for c = 1.90 decreases faster than for c = 2.20. Near N = 1000 there is a region where the 
peak heights approximately coincide and where an experimental discrimination among the two predictions will be 
more difficult. 

In Fig. 13 we compare predictions for the mean length of the first loop. Panel a) shows results for (fru)-boundary 
conditions. For N — 500 the effect of changing c is quite small, but it increases rapidly with N. Data for N = 3000 
could improve the estimate of c considerably. As panel b) shows, for (66)-boundary conditions the effect again is 
increased, to the level where data for N = 500 could be as useful as data for N = 3000 and (fru)-boundary conditions. 

These findings suggest that a more precise determination of the exponent c would be possible by comparing the 
results for (bu)- or (&6)-boundary conditions. Such simulations could be carried through for fairly short chains, since 
a clear effect is predicted even for N = 500. 
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VII. CONCLUSIONS 



We have found that a model of the type proposed by Poland and Scheraga is able to quantitatively reproduce 
simulation data of a very simple model of the DNA denaturation transition. It, however, turned out to be essential 
to amend the original P.S. -model by a factor p (m, N) weighting the unbound ends of the doubly stranded chain. The 
simple ansatz p (m, N) = m~ Cl (N — m)~ Cl is sufficient, but without such a factor the model cannot reproduce the 
chain length dependence of the data. Concerning the application of the P.S. -model to the analysis of physical melting 
curves, this may be the most important result found here. 

Concerning the simulation data we have found that an interpretation within the framework of the P.S. -model implies 
that chain lengths N < 3000 are far too short for asymptotic finite size scaling to hold. We must include corrections 
to this limit which decay only as 7V~I C_2 I. Since the numerical value of |c — 2| is quite small, this implies that all 
the effective exponents extracted by straight forward data analysis are strongly influenced by the correction terms. 
In particular this holds for the exponents which motivated the co-polymer network scenario. For the dangling ends 
this scenario introduces a weight p (m, N) involving two different exponents. Our analysis lends no support to this 
hypothesis. 

Allowing for unbinding of an end of the doubly stranded chain we have found that denaturation is dominated by the 
prolongation of the dangling ends. The internal loop structure of the closed part of the chain does not change much 
during the transition. As pointed out in Sect. Ill, for c > 2 this is true irrespective of N, but for the chain lengths 
considered here it holds true also for c < 2. We, for instance, note that for (c, c\) = (1.75, 0.11) , N — 3000, the mean 
length of a loop according to the theory increases only from {i) bu ~ 4.7 to (£) bu ~ 7.1 in a range where the average 
length of the open ends increases from (m) = 0.1 N to (m) = 0.9 N. This observation is of immediate consequences, 
if we try to determine the exponents (c, c\) from the data, ci, which governs the weight of the dangling ends, can 
be determined with fair precision. We found 0.10 < c\ < 0.15. The loop exponent c, however, at best is bounded by 
1-9 ^ c < 2.2. An even larger range results if we allow for more pronounced short range effects in the ansatz for the 
loop weight U (£) . In full agreement with previous work [11] we find that changes of c essentially are compensated by 
changing the cooperativity parameter a. 

Excluding simulation of much longer chains we see two possibilities to decrease the uncertainty of c. We found that 
the mean length (£{) of the first loop is sensitive to c, even if we allow the chain end to unbind. For N w 3000 the 
effect should be measurable reasonably well. Another, and possibly more efficient, approach would be to simulate 
chains bound together at both ends. This eliminates the pathway dominating denaturation for unbound ends. The 
theory predicts measurable effects of changing c even for N w 500. Furthermore such boundary conditions have the 
additional virtue of suppressing the factor p (m, N), which is not well known quantitatively. 

In summary, the results of the present work suggest that the Poland-Scheraga model generalized by an end- weighting 
factor catches the essential physics of the denaturation transition. Since the network scenario predicts such a factor 
we feel that the analysis also supports this scenario with its associated exponents (c, c\) = (2.15, 0.11) . However, more 
work must be done to put this latter conclusion on a firmer basis. 
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b) 

FIG. 1. Scaling function p of the density of bound pairs as function of r. a) ptu for ci = 0.11, c > 2. b) ptu (full line) and 
pbb (broken line) for ci = 0, c = 1.75. 
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FIG. 2. Scaling functions C of the fluctuations in the number of bound pairs, (melting curves), a) C'b u for ci = 0.11, c > 2. 
b) Cbu (full line) and Cbb (broken line) for ci = 0, c = 1.75. 
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FIG. 6. V bu as predicted by the P.S.-model. See the main text for a discussion. 
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FIG. 9. Melting curves C [bu] /N as function of r = N (e - 1.34114), where e* = 1.34114 has been determined with 
(c, ci) = (2.15, 0.11). Panel a): TV = 3000, panel b): N = 500. Data are taken from Ref. [1]. The curves give the results 
of the P.S. -model for exponents (2.15, 0.11), full lines; (1.75, 0), long dashes; (1.75, 0.11), short dashes. 
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FIG. 10. V l b ^ P] {n/N) for N = 500, e = 1.34264. Data from Ref. [1]. The theoretical curves are calculated for c = 2.15 and 
values ci = 0.07, short dashes; 0.11, full line; 0.15, long dashes; 0.20, dot-dashed curve. 
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FIG. 11. log 10 (e 2M5 P l b t L] ^ as function of log 10 £ Data taken from Ref. [3]. Circles: N = 320; points: N = 160. Theoretical 
curves are shown for (c, ci) = (2.05, 0.13), full line: (1.90, 0.11), long dashes; (2.20, 0.11), short dashes. The insert shows the 
results for (c,ci) = (2.15, 0.11) and U{€) = 0.236 r 2A5 (l -i' 50 ). 
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